arXiv:1505.00074v2 [cs.CV] 25 May 2015 


IMAGE DENOISING USING OPTIMALLY WEIGHTED BILATERAL FILTERS: 

A SURE AND FAST APPROACH 

*Kunal N. Chaudhury ^Kollipara Rithwik 

^Department of Electrical Engineering, Indian Institute of Science, Bangalore, India 
^Department of Electrical Engineering, Indian Institute of Technology, Hyderabad, India 


ABSTRACT 

The bilateral filter is known to be quite effective in denoising images 
corrupted with small dosages of additive Gaussian noise. The denois¬ 
ing performance of the filter, however, is known to degrade quickly 
with the increase in noise level. Several adaptations of the filter have 
been proposed in the literature to address this shortcoming, but often 
at a substantial computational overhead. In this paper, we report a 
simple pre-processing step that can substantially improve the denois¬ 
ing performance of the bilateral filter, at almost no additional cost. 
The modified filter is designed to be robust at large noise levels, and 
often tends to perform poorly below a certain noise threshold. To 
get the best of the original and the modified filter, we propose to 
combine them in a weighted fashion, where the weights are chosen 
to minimize (a surrogate of) the oracle mean-squared-error (MSB). 
The optimally-weighted filter is thus guaranteed to perform better 
than either of the component filters in terms of the MSB, at all noise 
levels. We also provide a fast algorithm for the weighted filtering. 
Visual and quantitative denoising results on standard test images are 
reported which demonstrate that the improvement over the original 
filter is significant both visually and in terms of PSNR. Moreover, 
the denoising performance of the optimally-weighted bilateral filter 
is competitive with the computation-intensive non-local means filter. 

Index Terms — Image denoising, bilateral filter, unbiased risk 
estimator, fast algorithm. 

1. INTRODUCTION 

We consider the standard problem of denoising grayscale images that 
are corrupted with additive white Gaussian noise (DIHIll- In this 
setup, we are given the corrupted (or noisy) image 

/W =/o(i) + o-• Mti (1) 

where 7 is some finite rectangular domain of Z^, {/o(i) : i £ 7} 
is the unknown clean image, {tUi : i £ 7} are independent and 
distributed as ^^(0, 1), and a is the noise level. 

The goal is find a denoised estimate /(i) of the clean image from 
the corrupted samples. The denoised image should visually resemble 
the clean image. One way to quantify the resemblance is using the 
mean-squared-error (MSB) which is defined to be 

MSB=^^(/(l)-/o(l))^ (2) 
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Later in the paper, we will use the peak signal-to-noise ratio (PSNR) 
which is given by PSNR = 10 log]^o(255^/MSE). The image de¬ 
noising problem has been extensively studied and a survey of even 
a fraction of this literature is beyond the scope of this paper. We 
instead refer the interested reader to [D - CH and the references 
therein. 

Our present interest is in the image denoising applications of the 
edge-preserving bilateral filter imiMiini. The denoised image in 
this case is set to be 

-., _ Ejen g<^s (i) g>yr (/(i - i) - /(O) /(i - j) 
where 

= exp and (f) = exp . (4) 

The Gaussian kernels in (Ell are respectively referred to as the spatial 
and range kernels. In practice, the domain fl is restricted to some 
neighbourhood of the origin. Typically, O is a square neighbourhood, 
Q = [—lU, lU] X [—VK, lU], where W — 3as fOl . We refer the 
reader to 11311161 for a detailed exposition on the working and in 
particular the edge-preserving property of the filter. 

The bilateral filter has received renewed attention in the image 
processing community in the context of image denoising 11711181 . 
It is well-known that, while the filter is quite effective in removing 
modest levels of additive noise, its denoising performance drops at 
large noise levels (H). It was demonstrated in diiii that a patch- 
based extension of the filter can be used to bring the denoising perfor¬ 
mance of the filter at par with state-of-the-art methods. These, and 
other advanced patch-based methods (mu [To), are however much 
more computation-intensive than the bilateral filter. 

1.1. Present Contribution 

In this work, we present a couple of ideas for improving the denois¬ 
ing performance of the classical bilateral filter, and give fast algo¬ 
rithms for the same. The contributions (and the organization) of the 
paper are as follows: 

• In Section|2 we demonstrate how the denoising performance 
of the bilateral filter can be improved at large noise levels (at al¬ 
most no additional cost) by incorporating a simple pre-processing 
step into the framework. 

• The modified filter is designed to be robust at large noise levels, 
and tends to perform poorly below a certain noise threshold. To 
get the best of the either filters at all noise levels, we propose to 
optimally combine them in a weighted fashion in Section The 
optimality is in terms of a certain surrogate of the mean-squared- 
error (MSB) known as Stein’s unbiased risk estimate (SURB) 03. 




This MSE-estimate is known to be quite accurate in practice (mnoi 
ED and, as a result, the combined filter is almost always guaranteed 
to provide a lower MSE than the original filter. 

• Following 12211231 . we present an approximation for the pro¬ 
posed filtering in Section|4]that has a fast implementation. We derive 
the SURE estimator for this approximation, and demonstrate how it 
can be efficiently computed in the process of approximating the bi¬ 
lateral filter. The overall cost of computing the estimator and the 
optimally-weighted filters is about twice the cost of computing a sin¬ 
gle bilateral filtering using the fast algorithm in 1221 . 

We provide both visual and quantitative denoising results on 
standard test images in Section [5] which demonstrate that the im¬ 
provement over the original filter is significant both visually and in 
terms of PSNR. 


2. ROBUST BILATEIfAL FILTER 


Notice that the range filter in ([3} operates on the noisy samples. In 
other words, the corrupted image is used not just for the averaging 
but also to control the blurring via the range filter. What if the range 
filter could directly operate on the clean image? It can be verified 
that the denoising image obtained using this “oracle” filter is visibly 
better and has higher PSNR, which is not surprising. Of course, we 
do not have access to the clean image in practice, and thus the oracle 
bilateral filter cannot be realized. Nevertheless, one could consider 
some form of proxy for the clean image. Our proposal is simply 
to use the box-filtered version of the noisy image as a proxy. In 
particular, we propose the following robust bilateral filter (RBF): 


/2(l) = 


Ejgn 9^s (j) 5 <t,(/(i - j) - /(i)) /(i - j) 


Ejen U) gcTrifii - j) - / W) 


(5) 


where 


/(i) 


1 

(2L-bl)2 


je{-L,i}2 


-j)- 


( 6 ) 


The amount of smoothing induced by the box filter is controlled by 
L. We performed exhaustive simulations in this direction. The sim¬ 
ulation results suggest that L = 1 (3 x 3 blur) is optimal for most 
settings. A possible way to explain this is that this small blur is able 
to suppress the noise without excessively blurring the image features. 
The denoising results obtained using the standard and the robust fil¬ 
ter a couple of test images are shown in Table [D We note that the 
filters have been independently tuned with respect to CTs and ar to 
get the best PSNR. These results (and the results on other test im¬ 
ages that are not shown here) suggest that the robust filter starts to 
perform better beyond a certain noise level depending on the type of 
image. This is not surprising since, at low noise levels, the box fil¬ 
tering introduces more blurring than noise suppression which brings 
down the overall signal-to-noise ratio. Indeed, the corrupted image 
is already a good proxy for the clean image when the noise floor is 
small. On the other hand, notice that the improvement in PSNR is 
often as large as 10 dB at large noise levels. 


Table 1. Comparison of the standard bilateral filter (SBF) and the 
robust bilateral filter (RBF) at cr = 10, 20, 30,40, 50, 60. For a fixed 
image and noise level, we tuned as and ar to individually optimize 
the PSNR obtained using both methods. 


Image 

Filter 



PSNR (dB) 



House 

SBF 

33.76 

29.88 

25.48 

21.44 

18.27 

15.83 

RBF 

33.15 

31.35 

29.85 

28.33 

27.23 

26.27 

Peppers 

SBF 

32.94 

28.97 

24.88 

20.86 

17.89 

15.56 

RBF 

31.30 

29.73 

27.92 

26.31 

25.17 

24.27 


Notice that this includes 13 and 13 as special cases. 

This bring us to the question of setting the weights 9i and 62 . 
This can hypothetically be done by minimizing the MSE between 
0 and the clean image. However, we do not have access to the 
clean image. This is precisely where the classical Stein’s unbiased 
risk estimate (SURE) is useful. This is given by 


SURE = ^ ^ (/(i) - /(i))" -a^+ 

l€J IGI 


|/| 


dM 
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■E 


(8) 


SURE has the property that its expected value equals that of 0 for 
the Gaussian noise model in 0 (ill- This makes it an useful sur¬ 
rogate for the MSE, which can be computed without the knowledge 
of the clean image. We note that the idea of taking linear (or affine) 
combinations of estimators and tuning the weights to get the optimal 
SURE has been tried in different contexts in the literature; e.g., see 
ED. Note that the added computation required for SURE are the 
computation of the partial derivatives in 0. 

We propose to find the optimal weights in 0 by minimizing 
SURE. In particular, on substituting 0 in 0, we see that SURE is 
quadratic in 61 and 62 . Thus, by convexity, a necessary and sufficient 
condition for optimality is that that the gradient of 0 must vanish at 
the optimal weights. In particular, it can be verified that the resulting 
gradient equations can be written as = b, where 


and 


A = 


b = 


E,sr/iW^ 

E,6//i(i)/2(i) 


E,6//i(i)/2(i) 


E,e//(i)/i(i) -cr^E,e/ft/i(i) 


(9) 


( 10 ) 


vE,e//(i)/2(i)-cr E,e/9i/2(i)z 

and 0 "^ = {of, Of) are the optimal weights. To simply nota¬ 
tions, we will henceforth use the shorthand d, in place of the op¬ 
erator d/df{i) appearing in 0. In summary, by computing A and 
b, and solving the 2x2 linear equation = b, we get the opti¬ 
mal weights. The weights are then plugged into 0 to get the final 
denoised image. 


3. OPTIMALLY WEIGHTED BILATERAL FILTERS 

The results in Table [D suggest that we could possibly perform bet¬ 
ter denoising by combining 0 and 0. A particularly simple idea 
would be to take a linear combination of these estimates. That is, we 
could set the denoised image to be 

/(l) =0 i/i(i)+02/2(i) (lel). (7) 


4. FAST AND SURE IMPLEMENTATION 

The cost of computing 0 and 0 is clearly 0 {a 1 ) per pixel, since 
the support of the spatial filter is proportional to af Moreover, we 
are also required to compute the derivatives for SURE which would 
further add to the cost. We now explain how we can compute 0, 
0, and the derivatives and using the fast algorithm 

in E3E3- It was observed here that, for sufficiently large N, we 









can accurately approximate the range kernel in 0 using 


cos 



N 

Cn exp (lUJnt) , 

n=0 


where i denotes \/—l, 


Cn — 


1 


2iv 



and LJn 


{2n - N) 

ar'/N 


( 11 ) 


( 12 ) 


Plugging ([TTJ into 0, using the multiplication-addition property of 
exponentials, and exchanging the summations, the numerator of 0 
can be expressed as 


N 

P(i) = 

71 = 0 

where Hn{i) = exp(-ta;„/(i)), F„(i) = and _F„(i) 

denotes the Gaussian filtering of Fn: 

F„{i) = j) (13) 

Here and later, we use z* to denote the complex conjugate of z. Sim¬ 
ilarly, the denominator of 0 can be expressed as 


N 

Q(i) = ^c„H4i)G4i), 

n=0 


where Gn(i) = H*{i) and Gn(i) is as defined in l ll3l l. In summary, 
we can approximate 0 using 


/i(i) = 


PW 

w 


(14) 


We note that the same notation has been used for both 0 and its 
approximation (EJ. The computation of d is clearly dominated 
by the computation of a series of Gaussian filtering. Now, each of 
these filtering can implemented in constant-time (for any arbitrary 
a a) using separability and recursions HU- We have thus been able 
to cut down the per-pixel complexity from 0(aa) to 0(1) using l ll4t . 
On the other hand, note that we can write 


dji(i) = ^ (d,P(i) - /i(i) d,Q(i)) . 

After some calculation and simplification, we can verify that 

N 

aP(i) = <?a40) (15) 

71 = 0 


aQ(i) = -jy]cna;„P„(i)G„(i). (16) 

71 = 0 

To arrive at the above formulas, we have use the fact that the sum of 
(cn) is 1 and that of {cnLo„) is 0. These identities are obtained by 
evaluating the both sides of (ID and its derivative at f = 0 . 

The important point here is that some of the quantities that are 
used for computing dl are reused to compute the partial derivatives 
in GD and l ll 6 t . The steps of the computation are summarized in 
Algorithm[D It is clear that the main computations are the Gaussian 
filtering in steps 12 and 13. That is, the overall cost is dominated 


Data: Noisy image /(i), and parameters CTs, cr^, N. 
Result: /i(i), diP{i), and d,Q{i). 

1 P{i) = 0; 

Qii) = 0; 
aP(i) = 0; 
aQ(i) = 0; 

c = 2-^, v = l/{arVN); 
for n = 0 , 1 ,..., do 

c = c{N — n)/(n -|- 1); 
uj = (2n — N)v, 

H{i) = exp(-ia;/(i)); 

G(i) = P*(i); 

P(i) = G(i)/(i); 

P(i)=cP(i)(P* 5 ,J(i); 
G(i)=cP(i)(G*5.J(i); 

P(i) = P(i) + P(i); 

Q(i) = Q(i) + G(i); 
aP(i) =aP(i)-fa;P(i); 
diQ{i) = aQ(i) -I- UJ G(i); 

18 end 

19 /i(i)=P(i)/g(i); 

20 aP(i) = gcTa{0) — *aP(i); 

21 d,Q{i) = —idiQ{i)-, 


Algorithm 1: Computation of (I14t. lll5t. and lll 6 t. 


by the cost of computing 2{N + 1) Gaussian filtering. Notice that 
we have recursively computed the binomial coefficients in GD, and 
we have introduced temporary variables to cut down some of the 
redundant computations. 

We can proceed similarly as above to approximate 0 and com¬ 
pute the associated derivatives. In particular, we can approximate 0 
as / 2 (i) = P(i)/S(i), where 

N N 

P„(i)I4(i) and S(i) = y]c„P„(i)l4„(i). (17) 

71 = 0 71 = 0 


Here U„{i) = exp(-ia;„/(i)), Wn{i) = and 14 ( 1 ) = 

H4(i)/(i). On the other hand, after some calculation, we find that 

a/2(l) = ^ (4P(l) - /2(l) aS(l)) , 


where 


N 

9.P(i) = g^M - ' y] c„cc„P4i)K(i), (18) 

' 71 = 0 


' y] (19) 

' 71 = 0 

Notice that l ll 8 t and H9t are respectively identical to H5I and 1 II 6 I 1 
except for the additional 1/(2P +1)^ term. This term appears owing 
to the fact that both R(i) and S ( 1 ) are functions of f{i). In particular, 
we get this term from the application of the chain rule and the fact 
that dif{i) = 1/(21/ + 1)^. Needless to say, the algorithm for com¬ 
puting the quantities in GD, GEi, and H9t is similar to Algorithm[D 
The complete Matlab implementation can be found here CD. 










Table 2. Comparison of the Standard Bilateral Filter (SBF), the 
Robust Bilateral Filter (RBF), the Weighted Bilateral Filter (WBF), 
and the Non-Local Means (NLM) filter |6| at noise levels a = 
10,15, 20, 25, 30, 40, 50, 60. We used the following test images 
(26l : Boat (B), Lena (L), House (H), Peppers (P), and Cameraman 
(C). 


Image Filter 




PSNR (dB) 





SBF 

32.02 

29.87 

28.44 

26.84 

24.86 

21.21 

18.20 

15.76 


RBF 

29.95 

29.51 

28.90 

28.17 

27.46 

26.42 

25.56 

24.84 

B 

WBF 

32.31 

30.44 

29.27 

28.33 

27.58 

26.50 

25.60 

24.86 


NLM 

31.93 

29.93 

28.57 

27.6 

26.90 

25.68 

24.58 

23.84 


SBF 

33.61 

31.61 

30.07 

27.97 

25.59 

21.60 

18.39 

15.83 


RBF 

33.30 

32.48 

31.49 

30.59 

29.84 

28.60 

27.63 

26.76 

L 

WBF 

34.31 

32.75 

31.56 

30.64 

29.87 

28.62 

27.64 

26.76 


NLM 

34.06 

32.33 

30.99 

29.89 

29.07 

27.74 

26.72 

25.85 


SBF 

33.76 

31.54 

29.88 

27.77 

25.48 

21.44 

18.27 

15.83 


RBF 

33.15 

32.34 

31.35 

30.56 

29.85 

28.33 

27.23 

26.27 

H 

WBF 

34.40 

32.66 

31.53 

30.63 

29.90 

28.35 

27.24 

26.27 


NLM 

34.63 

33.00 

31.63 

30.56 

29.34 

27.69 

26.36 

25.00 


SBF 

32.94 

30.71 

28.97 

27.01 

24.88 

20.86 

17.89 

15.56 


RBF 

31.30 

30.60 

29.73 

28.79 

27.92 

26.31 

25.17 

24.27 

P 

WBF 

33.38 

31.29 

29.95 

28.88 

27.97 

26.33 

25.20 

24.30 


NLM 

32.91 

30.71 

29.23 

28.06 

27.14 

25.51 

24.40 

23.35 


SBF 

32.66 

30.20 

28.55 

26.80 

24.77 

21.16 

18.12 

15.59 


RBF 

27.57 

27.34 

26.98 

26.45 

25.87 

25.00 

24.28 

23.59 

C 

WBF 

32.69 

30.28 

28.61 

27.25 

26.36 

25.28 

24.41 

23.66 


NLM 

32.61 

30.00 

28.57 

27.70 

27.01 

25.39 

24.28 

23.22 


5. EXPERIMENTS 

We now present some denoising results on standard test images. Our 
objective is to compare the denoising results obtained using the pro¬ 
posed modification with the standard bilateral filter, both visually 
and in terms of PSNR. In this work, we assume that a is provided; 
this has to be estimated in practice from the noisy image. In this re¬ 
gard, we note that it has been demonstrated in I20II21I that the SURE 
is quite robust to standard data-based estimates of cr. We remark that 
one can optimize the SURE for the proposed filter with respect to 
a a and Or as done in 12011211 ; however, we do not investigate this 
possibility in this paper. 


Table 3. Comparison of the average run times of the direct and 
the fast implementation of the weighted bilateral filter for different 
(cTs, (Tr). We used Barbara (512 x 512) for the comparison and set 
a = 20. The implementation was done using Matlab on an Intel 
quad-core 2.7 GHz machine with 8 GB memory. For both imple¬ 
mentations, we took the support of the Gaussian to be VU = 3aa- 


(2,15) 

(4,20) 

(3,25) 

(5, 30) 

(3,35) 

(4,40) 

Direct 32s 

120s 

70s 

185s 

74s 

125s 

Fast L2s 

0.8s 

0.7s 

0.6s 

0.5s 

0.6s 



(a) Cameraman (256 X 256). (b) Comrpted: 18.61 dB (cr = 30). 



(c) SBF: 24.76 dB; (2, 45). (d) WBF: 26.38 dB; (4, 20). 


Fig. 1. Denoising results using standard bilateral filter (SBF) and 
the proposed weighted bilateral filter (WBF). For reproducibility, we 
used the builtin Camerman image that comes with Matlab. We tuned 
the parameters of SBF and WBF to get the optimal PSNR in either 
case. The optimal (aa, etr) setting is indicated in the caption. 


noise level. As remarked earlier, the robust filter starts to perform 
better than the standard filter beyond a certain noise level (ct ~ 20). 
Notice that the PSNR obtained using the optimally-weighted filters 
is consistently higher than that of the constituent filters at all noise 
levels; the improvement is often as large as 1 dB. On the other hand, 
the improvement in PSNR over the standard bilateral filter is often 
as high as 10 dB. This does not come as a surprise since it is well- 
known that the bilateral filter has a lot of scope of improvement at 
high-noise levels CD. However, what does come as a surprise is 
that proposed filter is competitive with the non-local means filter @ 
that uses patches (groups of neighboring pixels) instead of single 
pixels for denoising. For a visual comparison, the results of a par¬ 
ticular denoising experiment are shown in Figure [T] Notice that the 
denoised image obtained using the proposed filter looks much more 
sharp than that obtained using the standard filter, and has a signifi¬ 
cantly higher PSNR. Also, notice that the noise in the background 
is much less in (d) compared to (c). In Table [3 we compare the run 
times of the direct and the fast implementation of the proposed filter 
for different parameter settings. Notice that the fast implementation 
is significantly faster than the direct implementation. 


In Table|2 the denoising performance of the proposed weighted 
filter at different noise level is compared with the standard and the 
robust bilateral filter, as well the patch-based non-local means filter 
111 . We have used L = 1 in HJ for all the experiments. To have a 
fair comparison for a given image and noise level, we independently 
tuned Oa and Or to optimize the PSNRs of the respective filters. We 
also tuned the parameters of NLM to get the optimal PSNR at each 


6. CONCLUSION 

We demonstrated how a simple pre-processing step can substantially 
improve the denoising performance of the bilateral filter. To con¬ 
sistently get the best of the standard and the pre-processed filter at 
all noise levels, we proposed to optimally weight them using Stein’s 




















unbiased estimate of the MSE. The optimal weights were found hy 
solving a small linear system. A fast algorithm for implementing the 
optimally-weighted hlters was also described. We reported visual 
and PSNR results on test images which confirmed the improvement 
over the original bilateral filter. An interesting finding was that the 
weighted bilateral filter is competitive with the non-local means fil¬ 
ter. 
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